Skip to content

REGR: Assigning label with registered EA dtype raises #38427

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 14, 2020

Conversation

rhshadrach
Copy link
Member

@rhshadrach rhshadrach commented Dec 12, 2020

Previous behavior in io.parsers depended on is_bool_dtype("boolean") erroneously raising AttributeError.

Test currently iterates over the names:

['category', 'interval', 'boolean', 'Float32', 'Float64', 'Int8', 'Int16', 'Int32', 'Int64', 'UInt8', 'UInt16', 'UInt32', 'UInt64', 'string']

It skips any EA where the name is a property; I don't see a good way to iterate over these in the test.

I only added a note about is_bool_dtype in the whatsnew as the DataFrame.__setitem__ issue does not occur on 1.1.x. It didn't seem appropriate for any section besides other.

@rhshadrach rhshadrach changed the title Is boolean REGR: Assigning label with registered EA dtype raises Dec 12, 2020
@rhshadrach rhshadrach added Dtype Conversions Unexpected or buggy dtype conversions Regression Functionality that used to work in a prior pandas version labels Dec 12, 2020
@rhshadrach rhshadrach added this to the 1.2 milestone Dec 12, 2020
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I only added a note about is_bool_dtype in the whatsnew as the DataFrame.setitem issue does not occur on 1.1.x. It didn't seem appropriate for any section besides other.

Indeed, if it's only a regression in master / 1.2rc, no need for a whatsnew

# property would require instantiation
if not isinstance(dtype.name, property)
],
*["datetime64[ns, UTC]", "period[D]"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but I find this ridiculous. And not you, but mypy ;) (since when are lists supposed to be of homogeneous content?)
If mypy can't deal with it, I would just add a ignore comment, instead of the unpacking into a list

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -856,7 +856,7 @@ Other
- Bug in :meth:`Index.drop` raising ``InvalidIndexError`` when index has duplicates (:issue:`38051`)
- Bug in :meth:`RangeIndex.difference` returning :class:`Int64Index` in some cases where it should return :class:`RangeIndex` (:issue:`38028`)
- Fixed bug in :func:`assert_series_equal` when comparing a datetime-like array with an equivalent non extension dtype array (:issue:`37609`)

- Bug in :func:`.is_bool_dtype` would raise when passed a valid string such as ``"boolean"`` (:issue:`38386`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo extra period before is_bool_dtype

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The period before is_bool_dtype allows sphinx link to the proper page; without it, a link will not be made. However, the period after the issue number is a typo.

@@ -1397,7 +1397,7 @@ def is_bool_dtype(arr_or_dtype) -> bool:
# guess this
return arr_or_dtype.is_object and arr_or_dtype.inferred_type == "boolean"
elif is_extension_array_dtype(arr_or_dtype):
return getattr(arr_or_dtype, "dtype", arr_or_dtype)._is_boolean
return getattr(dtype, "_is_boolean", False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment that this is de facto an isinstance(dtype, BooleanDtype)?

Copy link
Member Author

@rhshadrach rhshadrach Dec 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might also return True for certain sparse and categorical.

cast_type
) or is_extension_array_dtype(cast_type)
is_ea = is_extension_array_dtype(cast_type)
is_str_or_ea_dtype = is_string_dtype(cast_type) or is_ea
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put is_ea first to short-circuit the other check

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1707,11 +1706,7 @@ def _convert_to_ndarrays(
or is_extension_array_dtype(cast_type)
):
try:
if (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the if go outside the try instead of the other way around?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@jreback jreback merged commit e3443e3 into pandas-dev:master Dec 14, 2020
@jreback
Copy link
Contributor

jreback commented Dec 14, 2020

thanks @rhshadrach

@jreback
Copy link
Contributor

jreback commented Dec 14, 2020

@meeseeksdev backport 1.2.x

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Dec 14, 2020
jreback pushed a commit that referenced this pull request Dec 14, 2020
@rhshadrach rhshadrach deleted the is_boolean branch January 2, 2021 14:16
luckyvs1 pushed a commit to luckyvs1/pandas that referenced this pull request Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions Regression Functionality that used to work in a prior pandas version
Projects
None yet
Development

Successfully merging this pull request may close these issues.

REGR: assigning column with name that is registered as extension dtype errors
4 participants